Linux and Command Line Computing ala POSIX

An introduction to Linux, POSIX and command line yumminess.

By Richard Larkin

Why?

Although using GUIs (Graphical User Interfaces) is easier and more popular, using the command line to control computers has several advantages:

can issue infinitely more commands than are visually presented.
easier to reproduce and communicate.
supports scripting.
more flexible - supports chaining of commands i.e. feeding the output of one command to another.
can work without a GUI (fewer resources) or mouse.
can be used remotely (ssh).

The Rise of Linux

Linux became the standard for server side computing for various reasons, such as:

it's small and lightweight.
it does not require a graphical interface (suitable for servers).
it's flexible and extensible (small tools chained together).
it's free i.e. in legal terms, you own your own copy.
it's open source i.e. code can be inspected, checked and changed.

The freedom to change and modify the source code was fundamental as companies could easily re-use the code for their own purposes. It removed any dependencies on commercial vendors and prevented solution "lock-in".

Terminals, Shells and Consoles

These terms are often used interchangeably, and to some extent, they are interchangeable. But there is a difference. In summary:

terminal - tty = text input/output environment.
console - physical terminal i.e. a concrete implementation of a terminal.
shell - command interpreter for Unix-like execution environments.

Shells are probably the most important to distinguish because they define our commands' behaviour.

Terminals as Cross Platform Interfaces

Although Operating Systems vary wildly in their technical implementation, most expose a POSIX compatible interface via the command line.

POSIX stands for Portable Operating System Interface, and is an IEEE (Institute of Electrical and Electronics Engineers) standard designed to facilitate application portability. POSIX is an attempt by a consortium of vendors to create a single standard version of UNIX.

POSIX defines the application programming interface (API), along with command line shells and libraries, for software compatibility with variants of Unix and other operating systems.

Please see the Wikipedia POSIX page for more on POSIX.

Getting Started

There are many choices of terminal applications. Common ones are:

terminal - built in MacOS terminal.
iTerm - more customizable terminal for MacOS.
hyper - modern, pretty, HTML5 based terminal.
kitty - pretty, customizable, cross platform.
guake - advanced Linux terminal with neat desktop integration.
many, many others. See the Wikipedia Terminal Emulator page for more.

Choose one. They all work. For this demo, we'll use the one provided by this Jupyter notebook. Because Jupyter rules!

Shells

We care most about shells because they decide the functionality of our terminal, i.e. the commands we use are provided by our shell.

The most common shells are:

bash - Ubuntu default, MacOS default before Catalina.
zsh - more modern, feature rich. New default for MacOS Catalina.
sh - Bourne shell, one of the oldest.
ssh - Secure shell. Strictly a tool for reaching a shell on another machine rather than a command interpreter itself, but still widely used for secure, remote terminal access.
cmd.exe - Windows command line interpreter. Not a POSIX shell, but provides similar functionality on Windows.
powershell - modern Windows utility with powerful programming language integration.
many more, but of lesser importance.

Shells allow us to issue commands to the underlying operating system, be that for creating files, moving files, invoking programs or manipulating them.

Where? What? Who? How?

Let's get started with some basic commands.

pwd - "Print working directory" - gives your current active directory in the file system.
whoami - gives the current username that you are logged in with.
which <cmd> - gives the full path to the command specified.
man <cmd> - display the help and options for a given command.
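
As a rough illustration, the commands above can be tried together as below. The variable names (here, me, ls_path) are only for this sketch, and the man line is commented out because it opens an interactive pager.

```shell
here=$(pwd)            # the directory we are currently in
me=$(whoami)           # the user we are logged in as
ls_path=$(which ls)    # full path to the ls program

echo "I am $me, working in $here"
echo "ls lives at $ls_path"

# man ls               # uncomment to read the manual page (press q to quit)
```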

Common Useful Keyboard Shortcuts

Although each terminal application tends to have its own shortcut keys and configuration, most support these common shortcut keys.

up arrow - show your previous command.
tab - autocomplete the file name if possible.
control + r - open an interactive search over your previous commands.
alt + right arrow - move forward one word.
alt + left arrow - move backwards one word.

Directories

The following commands help you navigate, create and remove directories.

cd <dir> - change your active directory to the one specified.
- cd .. - move one directory up.
- cd - - move back to the last directory you were in.
- cd ~ - move to the current users home folder.
- cd / - move to the root of the file system.
mkdir <dir> - create a directory.
rm -rf <dir> - remove a directory and everything in it. Powerful! Be careful!
mv <source_dir> <dest_dir> - move or rename a folder.

Hint: Use tab autocompletion to check valid folder names!
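
A small sketch of these commands working together, run inside a throwaway folder so nothing important is touched. It assumes mktemp and basename are available (they are not in the list above), and the folder names are made up for the example.

```shell
start=$(pwd)              # remember where we started
sandbox=$(mktemp -d)      # scratch folder to play in

cd "$sandbox"
mkdir projects            # create a directory
mv projects work          # rename it
cd work                   # step into it
cd ..                     # go up one level
cd - > /dev/null          # jump back to "work" (cd - also prints the path)
inside=$(basename "$(pwd)")
echo "now inside: $inside"

cd "$start"               # return to where we began
rm -rf "$sandbox"         # remove the scratch folder and everything in it
```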

Files

The command line gives us tremendous power to:

create, view, extract and process files.
pass the output of one command to another command.
chain commands and outputs together to perform powerful:
- searches.
- modifications.
- replacements.
- extractions.

These abilities emerge from the Unix philosophy of small, dedicated, focused utilities which can be combined and used together. In this way, we have a tool where the result is greater than the sum of its parts.

Creating and Writing to Files

File manipulation forms the basis of how we interact with and manipulate stored data.

Creating

touch <file_name> - create a new file with the given name.
> <file_name> - a more concise way to create a file. Careful: if the file already exists, this empties it.
echo "<text>" > <file_name> - create a file and fill it with the given <text>, replacing any existing contents.
cp <source> <dest> - copy the <source> file to the <dest>.

Writing

echo "<text>" >> <file_name> - append the <text> to the end of the file.
echo -e "<text>" >> <file_name> - append the <text>, interpreting backslash escapes such as \n for a new line (support for -e varies between shells).

Of course, you can also use editors to create and write to files, but with these tools, you can script it.
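
For instance, a script could build up a file like this. It is a minimal sketch in a scratch folder; the file names are invented, and wc and tr (not covered above) are used only to count the lines.

```shell
workdir=$(mktemp -d)      # scratch folder for the example
cd "$workdir"

touch empty.txt                     # create an empty file
echo "first line" > notes.txt       # create notes.txt (or overwrite it)
echo "second line" >> notes.txt     # append to the end
cp notes.txt backup.txt             # copy it

# Count the lines in the copy; tr strips the padding some systems add.
lines=$(wc -l < backup.txt | tr -d ' ')
echo "backup.txt has $lines lines"
```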

Reading and Finding Files

Most jobs involve finding, reading and processing data in files. For that, we can use the following utilities.

Reading

cat <file_name> - show the contents of a file.
head <file_name> - show the first 10 lines of a file.
tail <file_name> - show the last 10 lines of a file.
more <file_name> - open a file and page forwards through it.
less <file_name> - open a file and page forwards and backwards through it.
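
A quick sketch of the non-interactive readers on a small made-up file. The -n option, which sets how many lines head and tail show, is an addition to the defaults described above.

```shell
sample=$(mktemp)
printf '%s\n' one two three four five > "$sample"

cat "$sample"                    # show all five lines
first=$(head -n 2 "$sample")     # only the first two lines
last=$(tail -n 1 "$sample")      # only the last line
echo "last line: $last"

rm "$sample"
```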

Finding

find <options> <path> <expression> - search for files in the requested <path> that match the <expression> and print their paths. For example:
```
find /home/Pictures -name '*.jpg'
```
would print the paths of all the files with a "jpg" extension in the "/home/Pictures" folder and its subfolders. The pattern is quoted so that the shell hands it to find unchanged instead of expanding it first.

find can also be used together with grep to find files that contain certain text. For example:

find . -type f -exec grep "example" '{}' \; -print

would search all the files under the current directory and display the matching lines, followed by the filename, for any file that contains the text "example".
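
To try both forms safely, here is a sketch that builds a tiny folder tree first. The folder and file names are invented, and grep's -l option (print only the names of matching files) is a variation on the command above, not the same invocation.

```shell
root=$(mktemp -d)
mkdir -p "$root/pics/holiday"
touch "$root/pics/a.jpg" "$root/pics/holiday/b.jpg"
echo "an example line" > "$root/pics/notes.txt"

# Find by name, including subfolders. The quotes keep the shell from
# expanding *.jpg before find sees it.
jpgs=$(find "$root" -name '*.jpg' | sort)
echo "$jpgs"

# Find by content: grep -l prints just the names of files that match.
matches=$(find "$root" -type f -exec grep -l "example" '{}' \;)
echo "$matches"
```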

File System Attributes and Permissions

In order to manage access rights and permissions, operating systems generally attach attributes to files. We can see these using the -l flag for the ls command.

ls -l <folder> - display the contents of the folder with their permissions.

These attributes are shown in 10 columns.

The first column indicates whether it's a directory, a symlink or a file.
- d - a directory
- l - a symlink (see symlink discussion later)
- - - a normal file
the next 9 columns show read, write and execute permissions respectively.
- r - readable
- w - writable
- x - executable
each group of three lists the permissions for the user (owner), the group and other (everyone else) respectively.

e.g.

-rw-r--r--   1 richard  staff    25 Apr  6 21:05 newfile.txt

indicates that for newfile.txt, the user can read and write to it, but the group and everyone else can only read it. This is a common default when creating a new file (the exact default depends on the system's umask setting).

Changing File Permissions

We can change file permissions using the chmod command.

chmod [options] <permissions> <file/folder> - sets the attributes on a given file or folder.

e.g.

chmod u=rwx,g=rx,o=r myfile

would give the user read, write and execute permissions, the user's group read and execute permissions and others read only permissions.

For a full discussion of chmod, see the Wikipedia page.
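
The effect of the chmod example can be checked with ls -l, roughly as follows. This sketch uses a temporary file, and cut (not covered here) just trims the listing down to the ten permission characters.

```shell
f=$(mktemp)                           # a scratch file to experiment on

chmod u=rwx,g=rx,o=r "$f"             # same permissions as: chmod 754 "$f"
perms=$(ls -l "$f" | cut -c1-10)      # keep only the 10 attribute characters
echo "$perms"

rm -f "$f"
```

The printed string should read as a normal file (-), then rwx for the user, r-x for the group and r-- for others.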

Command Line Text Editors

There are many command line text editors for creating and editing files on disk. Some of the more common ones are:

vi - simplest and most prevalent editor. Useful as it's present in most containers and operating systems. Uses a "macro" language.
vim - vi enhanced, with more features and functions.
nano - another lightweight editor.
emacs - another highly popular editor, but can also perform file management, terminal emulation and more.

For a complete list, see the Wikipedia Text Editor page.

Connecting Commands

The commands we've looked at so far get us started, but do nothing special on their own. To unleash the beastly power of the command line, we need a way to make them work together. The simplest method is the pipe operator.

The Pipe

| - the "pipe" operator directs the output of one command to the input of another. e.g. the following example feeds the output of the ls command into the more command, allowing us to page through a directory listing.
```
$ ls /etc | more
```

The pipe is crucial in allowing us to feed the output of one command as the input into another command using standard input and output i.e. stdin and stdout.
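
Pipes can be chained as many times as needed. As a sketch, the pipeline below counts the distinct words in a short made-up list; sort, uniq, wc and tr are standard utilities that are not otherwise covered in this talk.

```shell
# sort puts duplicates next to each other, uniq drops the repeats,
# wc -l counts the remaining lines, and tr strips any padding spaces.
distinct=$(printf '%s\n' pear apple pear fig apple | sort | uniq | wc -l | tr -d ' ')
echo "distinct words: $distinct"
```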

Connecting Commands via Arguments

Some commands or programs require input via arguments, not via stdin. For these, we can use xargs.

xargs <command> - take input from stdin and feed it, as arguments, to <command>. e.g.
```
echo 'one two' | xargs mkdir
```
will create two directories, "one" and "two". To create a single directory called "one two", we would escape the space.
```
echo 'one\ two' | xargs mkdir
```

In another example, the following line would create a new file for each name listed in filelist.txt.

cat filelist.txt | xargs touch
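
To see this end to end, the sketch below first writes a filelist.txt with three invented names into a scratch folder and then feeds it to touch.

```shell
dir=$(mktemp -d)
cd "$dir"

# One file name per line.
printf '%s\n' alpha.txt beta.txt gamma.txt > filelist.txt

# xargs turns the lines arriving on stdin into arguments,
# so this runs: touch alpha.txt beta.txt gamma.txt
cat filelist.txt | xargs touch

ls
```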

Processing Commands with "grep"

Next, we need commands to help us search, extract and manipulate data.

Regular Expressions

Regular expressions give us a powerful, flexible and simple way to search text. For this, we have the grep command. Its name comes from the ed editor command g/re/p: "Globally search for a Regular Expression and Print".

grep <pattern> <files> - search through the <files> for lines which match the <pattern>.

It is often used as a filter, to process the output of one command and present only the items we are interested in. e.g.

ls /etc | grep sudo

would give us a listing of only the entries in the /etc folder whose names contain the string "sudo".
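
A few simple patterns on a made-up log file give a feel for what regular expressions add: ^ anchors a match to the start of a line and $ to the end. The -c option, which counts matching lines instead of printing them, is an extra not described above.

```shell
log=$(mktemp)
printf '%s\n' "error: disk full" "info: started" "error: timeout" > "$log"

grep "error" "$log"                  # lines containing "error"
starts=$(grep -c "^info" "$log")     # how many lines start with "info"
ends=$(grep "out$" "$log")           # lines that end in "out"
echo "$starts line(s) start with info; ends in out: $ends"

rm "$log"
```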

Processing Commands with "awk"

awk is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

awk <options> '<selection_criteria> {<action>}' <input_file> > <output_file> - Take the <input_file>, search for text matching the <selection_criteria>, apply the <action> and put the results in the <output_file>.

That's quite a lengthy command, but in practice, it's quite simple to use. For example:

awk '{print NR,$0}' employee.txt

prints each line of the "employee.txt" file with the corresponding line number before it. awk is a powerful tool and contains way more abilities than can be discussed here. Please see the Wikipedia AWK article for more detail.
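
As a slightly fuller sketch, the commands below use a selection criterion and a running total on an invented two-column file (name, age). In awk, $1 and $2 refer to the first and second whitespace-separated columns of each line.

```shell
data=$(mktemp)
printf '%s\n' "alice 30" "bob 25" "carol 41" > "$data"

awk '{print NR, $0}' "$data"                          # number every line
over=$(awk '$2 > 28 {print $1}' "$data")              # names where column 2 exceeds 28
total=$(awk '{sum += $2} END {print sum}' "$data")    # add up column 2
echo "total age: $total"

rm "$data"
```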

Changing Files In Place

It's a common use case that we want to replace all instances of a string in a file with another string. This can be done using the sed command: the "Stream EDitor".

sed <options> '<script>' <file_name> - perform insertion, deletion, search or replace on the given <file_name>, as described by the <script>. e.g.
```
sed 's/bobs/zibs/g' test.txt > newfile.txt
```
would replace all "bobs" in the "test.txt" file with "zibs" and store the results in "newfile.txt". The trailing g ("global") matters: without it, only the first match on each line is replaced. To perform the replacement in the original file, we can use the -i option.
```
sed -i 's/bobs/zibs/g' test.txt
```

Note: On MacOS, the -i option requires an argument naming the suffix for a backup file, so we pass an empty string '' to skip the backup, i.e. sed -i '' 's/bobs/zibs/g' test.txt
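
The difference the g flag makes is easiest to see side by side. This sketch writes to stdout rather than using -i, so it behaves the same on Linux and MacOS; the sample text is made up.

```shell
src=$(mktemp)
printf '%s\n' "bobs and bobs" "no match here" > "$src"

# Without g: only the first match on each line is replaced.
first_only=$(sed 's/bobs/zibs/' "$src" | head -n 1)

# With g: every match on each line is replaced.
every=$(sed 's/bobs/zibs/g' "$src" | head -n 1)

echo "without g: $first_only"
echo "with g:    $every"
rm "$src"
```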

Creating Scripts

Scripts are simply text files that contain instructions that can be understood or executed by an internal command or interpreter.

As an example, we can create a simple zsh script as follows:

#! /bin/zsh
<instruction line 1>
<instruction line 2>

e.g.

#! /bin/zsh
echo "The contents of this folder are:"
ls

This file, when executed, will print "The contents of this folder are:" followed by a listing of the files and folders in the current directory.

Note: Although the shebang (#!) is not strictly required, it tells the operating system which interpreter to use when the file is run as an executable. There are many reasons to use it. For further reading, see the Wikipedia Shebang article.

Invoking Scripts

There are two main ways to execute scripts.

manually invoke the script by passing it as an argument to a command e.g. sh script.sh or python main.py.
by marking the script as executable and invoking it directly e.g.
```
chmod +x script.sh
./script.sh
```

Note: when using the second approach, notice that we specify the current folder for the script's location, otherwise the OS will typically only look in the system's PATH. This is deliberate and for security reasons. Without this behavior, locally placed files could maliciously intercept commands and alter the desired behavior.
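
Both approaches can be sketched from start to finish as below. The script name hello.sh is invented, and the example uses /bin/sh rather than zsh on the assumption that sh is present on every system.

```shell
dir=$(mktemp -d)

# Write a two-line script. The quoted 'EOF' stops the shell from
# expanding anything inside the block.
cat > "$dir/hello.sh" <<'EOF'
#!/bin/sh
echo "hello from a script"
EOF

# Way 1: hand the file to an interpreter explicitly.
via_sh=$(sh "$dir/hello.sh")
echo "$via_sh"

# Way 2: mark it executable and run it by its path.
chmod +x "$dir/hello.sh"
"$dir/hello.sh"
```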

Symlinks - "With great power...."

Symlinks are symbolic links. They appear as file system objects but are effectively just pointers to other files and folders. They are useful for accessing the same file from various places or paths. Instead of making "copies", a symlink just "points" to the original file and makes it appear like it is in multiple places!

ln -s <file/folder> <linkname> - create a symbolic link to the <file/folder> called <linkname>.

Warning: Symbolic links are very powerful in terms of linking, sharing and preventing data duplication, but they also make it very easy to delete and modify data unintentionally. Beware.
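
The warning is easy to demonstrate. In this sketch (file names invented) writing "to the link" silently changes the original file, because the link is only a pointer.

```shell
dir=$(mktemp -d)
cd "$dir"

echo "original data" > real.txt
ln -s real.txt shortcut.txt              # shortcut.txt now points at real.txt

via_link=$(cat shortcut.txt)             # reading the link reads the real file
echo "changed via link" > shortcut.txt   # writing through the link changes real.txt
after=$(cat real.txt)

echo "before: $via_link"
echo "after:  $after"
```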

Environment Variables

Environment variables provide us with important context regarding our execution environment. They provide access to variables that describe and expose our system configuration.

env - print a list of all our environment variables.
$<env_var_name> - insert the value of the named environment variable here e.g.
```
echo "My name is slim shady, I mean $USER."
```

Environment variables are also very useful for keeping confidential values and secrets separate from our scripts. They allow us to use these values without actually having their values in the scripts.
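
As a sketch of that idea, a value can be exported once in the shell and then read by any program started from it. The variable name API_TOKEN and its value are hypothetical, and export (which makes a variable visible to child processes) is not covered above.

```shell
export API_TOKEN="not-a-real-secret"    # hypothetical name and value

# A child process (here a second shell) inherits exported variables,
# so a script can use the value without containing it.
seen=$(sh -c 'echo "$API_TOKEN"')
echo "child saw: $seen"

# env lists it alongside everything else.
env | grep '^API_TOKEN='
```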

Modifying Your Shell with "The Source"

By default, your shell will contain only commands provided by your standard shell. It is possible and, indeed, highly desirable to load new sets of commands into your shell. For this magic, we use some tasty sauce!

source <file_name> - Read and execute commands from <file_name> in the current shell environment.

Note: . is the POSIX standard spelling of this command, i.e. . <file_name>. source is an equivalent provided by shells such as bash and zsh.

This allows us to load aliases and custom commands into our shell, and is often used in Python to activate our virtual environments.

source venv/bin/activate
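
To see what sourcing does, the sketch below writes a small helper file and loads it into the current shell. The file name helpers.sh and the greet function are invented for the example, and it uses the . spelling so that it also runs under plain sh.

```shell
dir=$(mktemp -d)

# A file defining a variable and a custom command.
cat > "$dir/helpers.sh" <<'EOF'
GREETING="hello"
greet() { echo "$GREETING, $1"; }
EOF

# Load it into this shell. In bash or zsh, source "$dir/helpers.sh" does the same.
. "$dir/helpers.sh"

out=$(greet world)     # greet now exists as if we had typed it in ourselves
echo "$out"
```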

Aliases

Aliases give you a way to define your own commands by combining other commands. They are useful if you use similar sets of commands often, or want to simplify the use of complex commands.

alias <name>='<commands>' - Give the series of <commands> the alias <name>.

For example:

alias makeenv="rm -rf venv/ && python3 -m venv venv && source venv/bin/activate && pip install -r requirements.txt"

Now when you type makeenv, a new python3 virtual environment will be created and the requirements installed into it.

Summary

This talk only touches the surface of what is possible with the tools available via POSIX compatible command line utilities. It is an introduction, and far from comprehensive or detailed.

Each one of these topics could be a talk by itself, but we hope this at least gives us a starting point for the wonderful journey into the deep, dark world that is the command line!

Go forth and command!

Thank you

Note: This notebook is available under an MIT License and can be downloaded from Zen-CODE's github repository.